Acute myeloid leukemia (AML) is a malignancy of hematopoietic stem and progenitor cells (HSPCs), characterized by the unchecked proliferation of myeloid blasts. Emerging evidence suggests that the leukemic microenvironment plays a role in treatment resistance; however, current high-throughput in vitro drug screening methods fail to model these interactions accurately. The zebrafish embryo is an established model system for studying HSPCs in their native microenvironment during early development. Zebrafish are amenable to chemical genetic screening due to their high fecundity, rapid growth, optical transparency, and genetic similarity to mammals. However, manual quantification of HSPCs in images of Runx1:mCherry HSPC reporter zebrafish embryos is time-intensive and subjective, making it a bottleneck in large-scale screening.

To address this, we developed an automated deep learning image analysis pipeline that supports both high-throughput chemical screening in zebrafish embryos and streamlined quantification of HSPCs from the smaller-scale imaging experiments commonly performed in our lab and others. These images vary in size and source: chemical screening images were captured on a Perkin-Elmer Opera Phenix Plus Confocal Imaging System at an average resolution of 1806 × 1753 pixels (±449 × 454), and general in-lab experiments were imaged on a Keyence BZ-X710 Fluorescence Microscope at a resolution of 640 × 480 pixels. Preliminary analysis demonstrated that widely used general-purpose cell segmentation algorithms, such as Cellpose and StarDist, could not accurately identify zebrafish HSPCs. These approaches are poorly suited to the extreme class imbalance of these images, in which the ratio of background to cells exceeds 7500:1 in the chemical screening images. To mitigate this limitation, we implemented a two-step segmentation approach in which an initial U-Net convolutional neural network (CNN) localizes and masks the zebrafish caudal hematopoietic tissue (CHT), the functional equivalent of the mammalian fetal liver, where hematopoiesis occurs in zebrafish embryos. This model accurately identifies the CHT in validation images from the chemical screening, with a Dice similarity coefficient exceeding 94%.
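For readers unfamiliar with the overlap metric reported for the CHT mask, the following is a minimal sketch of the Dice similarity coefficient for two binary masks, 2|A ∩ B| / (|A| + |B|). The function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details taken from the pipeline described here.

```python
import numpy as np


def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient between two binary masks (hypothetical helper)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    total = pred.sum() + true.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / total


# Toy masks: two 4 x 4 squares offset by one row, so they share 12 pixels.
true = np.zeros((8, 8), dtype=bool)
true[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[3:7, 2:6] = True

score = dice_coefficient(pred, true)  # 2 * 12 / (16 + 16) = 0.75
print(f"Dice = {score:.2f}")
```

A Dice value above 0.94, as reported for the CHT model, would correspond to predicted and annotated regions that overlap almost entirely.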

Using this precise CHT localization, we consistently align the region across samples before training a secondary U-Net-based model on overlapping cropped tiles of the CHT to detect HSPCs. Creating augmented copies of many tiles across images yielded 128 × 128 pixel training patches drawn from varied source images. Model performance improved with the addition of weight maps that penalized errors in cell-dense areas. After evaluating multiple U-Net-based architectures, including a small U-Net, Attention U-Net, and R2U-Net, we found that a standard U-Net combined with the custom weight map achieved the best performance for semantic segmentation, that is, identifying whether a given pixel corresponds to a cell (Dice = 78%). To convert this binary pixel classification into cell instance labels, we tested several post-processing approaches against ground truth instance labels. Connected component analysis followed by a watershed algorithm yielded high specificity, achieving an average precision (AP) of 0.945 and an F1 score of 0.948 on chemical screening images at an intersection over union (IoU) threshold of ≥ 0.5.
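The tiling step can be illustrated with a short sketch that cuts an image into overlapping 128 × 128 patches and produces simple augmented copies. Only the patch size comes from the text; the 64-pixel stride, the zero padding of small images, the flip and rotation augmentations, and the helper names `tile_starts`, `extract_tiles`, and `dihedral_augmentations` are assumptions chosen for illustration.

```python
import numpy as np


def tile_starts(length, tile, stride):
    """Start offsets so that tiles of size `tile` cover the range [0, length)."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)  # last tile sits flush with the border
    return starts


def extract_tiles(image, tile=128, stride=64):
    """Cut an image (H x W or H x W x C) into overlapping square tiles."""
    h, w = image.shape[:2]
    pad_h, pad_w = max(0, tile - h), max(0, tile - w)
    if pad_h or pad_w:
        # Zero-pad images smaller than one tile (an assumed choice).
        pad = ((0, pad_h), (0, pad_w)) + ((0, 0),) * (image.ndim - 2)
        image = np.pad(image, pad, mode="constant")
        h, w = image.shape[:2]
    tiles, origins = [], []
    for y in tile_starts(h, tile, stride):
        for x in tile_starts(w, tile, stride):
            tiles.append(image[y:y + tile, x:x + tile])
            origins.append((y, x))
    return np.stack(tiles), origins


def dihedral_augmentations(tile):
    """Return the 8 rotated and mirrored variants of a square tile."""
    variants = []
    for k in range(4):
        rotated = np.rot90(tile, k)
        variants.append(rotated)
        variants.append(np.fliplr(rotated))
    return variants


# Toy stand-in for an aligned CHT crop.
cht_crop = np.arange(300 * 400, dtype=np.float32).reshape(300, 400)
tiles, origins = extract_tiles(cht_crop)
augmented = dihedral_augmentations(tiles[0])
print(tiles.shape, len(augmented))
```

Keeping the tile origins alongside the patches makes it possible to stitch per-tile predictions back into a full-size mask later.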

By integrating these components, our pipeline enables the detection and quantification of cells in large, information-sparse fluorescence microscopy images of zebrafish. A model trained on chemical screening images achieved an average precision (AP) of 0.830 and an F1 score of 0.782 on full-size validation images at an IoU threshold of ≥ 0.5, exceeding the agreement between two experienced zebrafish scientists, who scored an AP of 0.707 and an F1 score of 0.731 on the chemical screening images. A model trained using the same pipeline performed marginally worse than inter-human agreement on low-throughput images, with an AP of 0.701 and an F1 score of 0.656, compared to inter-human agreement scores of AP = 0.780 and F1 = 0.768. This approach demonstrates the utility of combining multiple neural networks, complemented by programmatic image manipulation, to reduce time constraints for low- and high-throughput live cell imaging in the zebrafish.
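As an illustration of how instance-level scores at an IoU threshold can be computed, the sketch below matches predicted cells to annotated cells one-to-one and derives precision, recall, and F1 from the resulting counts. The abstract does not state the exact matching procedure or AP definition used, so the greedy matching, the function names `iou_matrix` and `match_scores`, and the toy label images are assumptions.

```python
import numpy as np


def iou_matrix(true_labels, pred_labels):
    """Pairwise IoU between instances in two integer label images (0 = background)."""
    true_ids = [i for i in np.unique(true_labels) if i != 0]
    pred_ids = [i for i in np.unique(pred_labels) if i != 0]
    iou = np.zeros((len(true_ids), len(pred_ids)))
    for a, t in enumerate(true_ids):
        t_mask = true_labels == t
        for b, p in enumerate(pred_ids):
            p_mask = pred_labels == p
            inter = np.logical_and(t_mask, p_mask).sum()
            union = np.logical_or(t_mask, p_mask).sum()
            iou[a, b] = inter / union
    return iou


def match_scores(true_labels, pred_labels, threshold=0.5):
    """Greedy one-to-one matching at IoU >= threshold, then precision, recall, F1."""
    iou = iou_matrix(true_labels, pred_labels)
    n_true, n_pred = iou.shape
    candidates = sorted(
        ((iou[a, b], a, b)
         for a in range(n_true) for b in range(n_pred)
         if iou[a, b] >= threshold),
        reverse=True,
    )
    used_true, used_pred = set(), set()
    for _, a, b in candidates:
        if a not in used_true and b not in used_pred:
            used_true.add(a)
            used_pred.add(b)
    tp = len(used_true)
    fp = n_pred - tp  # predicted cells with no matching annotation
    fn = n_true - tp  # annotated cells that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy example: one exact match and one poorly overlapping prediction.
true_labels = np.zeros((10, 10), dtype=int)
true_labels[1:4, 1:4] = 1
true_labels[6:9, 6:9] = 2
pred_labels = np.zeros((10, 10), dtype=int)
pred_labels[1:4, 1:4] = 1   # IoU = 1.0 with annotated cell 1
pred_labels[6:9, 8:10] = 2  # IoU = 3 / 12 = 0.25 with annotated cell 2

scores = match_scores(true_labels, pred_labels)
print(scores)
```

The same function can score one annotator against another by passing the second annotator's labels as `pred_labels`, which is one way an inter-human agreement figure could be obtained.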
